On the Dirichlet Prior and Bayesian Regularization

Authors

  • Harald Steck
  • Tommi S. Jaakkola
Abstract

Motivation & Previous Work: A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data sources. Regularization is essential when learning from finite data sets. It provides not only smoother estimates of the model parameters compared to maximum likelihood but also guides the selection of model structures. In the Bayesian approach, regularization is achieved by specifying a prior distribution over the parameters and subsequently averaging over the posterior distribution. In domains comprising discrete variables with a multinomial distribution, the Dirichlet distribution is the most commonly used prior over the parameters, for two reasons: first, the Dirichlet distribution is the conjugate prior of the multinomial distribution and hence permits analytical calculations; second, the Dirichlet prior is intimately tied to the desirable likelihood-equivalence property of network structures [1, 3]. The so-called equivalent sample size measures the strength of the prior belief. In [3], it was pointed out that a very strong prior belief can degrade the predictive accuracy of the learned model due to severe regularization of the parameter estimates. In contrast, the dependence of the learned network structure on the prior strength has not received much attention in the literature, despite its relevance for recovering the "true" network structure underlying the data.
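
To make the role of the equivalent sample size concrete, here is a minimal Python sketch (illustrative names, not code from the paper) of the standard posterior-mean estimate of a multinomial parameter under a symmetric Dirichlet prior; the equivalent sample size controls how strongly the estimate is pulled from the maximum-likelihood counts toward the uniform distribution:

    import numpy as np

    def posterior_mean(counts, ess):
        # Posterior-mean estimate under a symmetric Dirichlet prior:
        # theta_k = (N_k + ess/K) / (N + ess), where ess is the
        # equivalent sample size; ess -> 0 recovers maximum likelihood.
        counts = np.asarray(counts, dtype=float)
        return (counts + ess / counts.size) / (counts.sum() + ess)

    counts = [8, 1, 1]                       # observed counts of a 3-state variable
    print(posterior_mean(counts, ess=0.0))   # [0.8 0.1 0.1], the ML estimate
    print(posterior_mean(counts, ess=30.0))  # strong prior pulls toward uniform

A very large equivalent sample size dominates the observed counts, which is the severe regularization of the parameter estimates referred to above.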

Related Articles

Introducing the Dirichlet Process Prior in the Nonparametric Bayesian Models Framework

Statistical models are used to learn about the mechanism from which the data are generated. Often it is assumed that the random variables y_i, i = 1, …, n, are samples from a probability distribution F that belongs to a parametric class of distributions. In practice, however, a parametric model may be inappropriate to describe the data. In this setting, the parametric assumption could be r...
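
As a concrete illustration of a Dirichlet process prior, the following sketch draws an approximate sample from DP(alpha, G0) via truncated stick-breaking; the standard-normal base measure, concentration value, and truncation level are illustrative assumptions, not taken from the article:

    import numpy as np

    rng = np.random.default_rng(0)

    def dp_sample(alpha, base_sampler, truncation=1000):
        # Truncated stick-breaking construction of DP(alpha, G0):
        # v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k} (1 - v_j),
        # atoms drawn i.i.d. from the base measure G0.
        v = rng.beta(1.0, alpha, size=truncation)
        w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
        return base_sampler(truncation), w

    atoms, w = dp_sample(alpha=2.0, base_sampler=lambda n: rng.normal(size=n))
    # A DP draw is a discrete distribution over the atoms; sample from it:
    print(rng.choice(atoms, size=5, p=w / w.sum()))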

Bayesian shrinkage

Penalized regression methods, such as L1 regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimension...
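
A minimal sketch of the two-component mixture prior mentioned above, a point mass at zero (the "spike") plus a Gaussian "slab"; the inclusion probability and slab scale are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(1)

    def spike_and_slab(p, pi=0.1, slab_sd=1.0):
        # With probability 1 - pi a coefficient is exactly zero;
        # otherwise it is drawn from a Gaussian slab.
        active = rng.random(p) < pi
        return np.where(active, rng.normal(0.0, slab_sd, size=p), 0.0)

    beta = spike_and_slab(p=20)
    print(beta)                    # mostly exact zeros
    print(np.count_nonzero(beta))  # around pi * p nonzero entries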

Characterizing the Function Space for Bayesian Kernel Models

Kernel methods have been very popular in the machine learning literature in the last ten years, mainly in the context of Tikhonov regularization algorithms. In this paper we study a coherent Bayesian kernel model based on an integral operator defined as the convolution of a kernel with a signed measure. Priors on the random signed measures correspond to prior distributions on the functions mapp...
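
As a loose, discretized analogue of that construction, the sketch below draws a random function by convolving an RBF kernel with a random signed measure supported on finitely many points; the kernel, support locations, and Gaussian weights are illustrative assumptions, not the paper's model:

    import numpy as np

    rng = np.random.default_rng(2)

    locations = np.linspace(-3, 3, 30)   # support points of the signed measure
    weights = rng.normal(size=30)        # random signed weights (the "measure")

    def f(x, bandwidth=0.5):
        # f(x) = sum_j w_j * k(x, x_j): a draw from the induced
        # prior over functions, with k an RBF kernel.
        k = np.exp(-0.5 * ((x[:, None] - locations[None, :]) / bandwidth) ** 2)
        return k @ weights

    print(f(np.linspace(-3, 3, 7)))      # one sample path of the random function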

Bayesian Multi-Task Compressive Sensing with Dirichlet Process Priors

Compressive sensing (CS) is an emerging field that, under appropriate conditions, can significantly reduce the number of measurements required for a given signal. Specifically, if the m-dimensional signal u is sparse in an orthonormal basis represented by the m × m matrix Ψ, then one may infer u based on n ≪ m projection measurements. If u = Ψθ, where θ are the sparse coefficients in basis Ψ, the...
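
To illustrate the n ≪ m recovery claim, here is a small self-contained sketch that reconstructs a sparse signal from random projections with orthogonal matching pursuit; the greedy solver and the identity basis (Ψ = I, so u = θ) are illustrative simplifications, not the multi-task Dirichlet-process method of the paper:

    import numpy as np

    rng = np.random.default_rng(3)

    m, n, k = 128, 40, 4                 # signal dim, measurements (n << m), sparsity
    theta = np.zeros(m)
    theta[rng.choice(m, k, replace=False)] = rng.normal(size=k)
    Phi = rng.normal(size=(n, m)) / np.sqrt(n)   # random projection matrix
    y = Phi @ theta                      # n << m measurements

    # Orthogonal matching pursuit: greedily add the column most
    # correlated with the residual, then re-fit by least squares.
    support, residual = [], y.copy()
    for _ in range(k):
        support.append(int(np.argmax(np.abs(Phi.T @ residual))))
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef

    theta_hat = np.zeros(m)
    theta_hat[support] = coef
    print(np.max(np.abs(theta_hat - theta)))  # near-zero recovery error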

Posterior Consistency of the Silverman g-prior in Bayesian Model Choice

Kernel supervised learning methods can be unified by utilizing the tools from regularization theory. The duality between regularization and prior leads to interpreting regularization methods in terms of maximum a posteriori estimation and has motivated Bayesian interpretations of kernel methods. In this paper we pursue a Bayesian interpretation of sparsity in the kernel setting by making use of...
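
The regularization-prior duality mentioned above can be checked numerically: the ridge (penalized least-squares) solution coincides with the MAP estimate under a Gaussian prior. The data, noise level, and penalty below are illustrative:

    import numpy as np

    rng = np.random.default_rng(4)

    # With unit-variance Gaussian noise and prior b ~ N(0, I/lam), the
    # log posterior is, up to constants, -0.5*||y - Xb||^2 - 0.5*lam*||b||^2,
    # whose maximizer is exactly the ridge estimate.
    X = rng.normal(size=(50, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
    lam = 1.0

    ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

    b = np.zeros(3)          # MAP by gradient ascent on the log posterior
    for _ in range(5000):
        b += 1e-3 * (X.T @ (y - X @ b) - lam * b)
    print(np.allclose(ridge, b, atol=1e-6))  # True: the two views coincide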

Journal:

Volume:   Issue:

Pages:  -

Publication date: 2002